Temporal Difference Approach to Playing Give-Away Checkers

Authors

  • Jacek Mandziuk
  • Daniel Osman
Abstract

This paper examines the use of temporal difference methods to learn a linear approximation of the state value function in the game of give-away checkers. Empirical results show that the TD(λ) algorithm can improve the quality of the playing policy in this domain. Training games against both strong and random opponents were considered. The results show that learning only from negative game outcomes improved the learning player's performance against strong opponents.
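
As a rough illustration of the kind of update the abstract refers to, the following is a minimal sketch of backward-view TD(λ) with a purely linear value function V(s) = w · φ(s). It is not the authors' implementation: the feature vectors, the outcome encoding (+1 win, 0 draw, -1 loss), the step size, and the function names `td_lambda_episode` and `train` are all assumptions made for this example. The `only_losses` flag shows one possible reading of "learning only from negative game outcomes", namely skipping every game the learner did not lose.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def td_lambda_episode(weights: Vector, features: Sequence[Vector],
                      outcome: float, alpha: float = 0.01,
                      lam: float = 0.7, gamma: float = 1.0) -> List[float]:
    """Apply TD(lambda) with accumulating traces to one finished game.

    features: hypothetical feature vectors phi(s_0), ..., phi(s_{T-1}) of the
              non-terminal positions seen by the learner, in play order.
    outcome:  final result from the learner's point of view (assumed encoding:
              +1 win, 0 draw, -1 loss); it is the only non-zero reward.
    """
    w = list(weights)
    trace = [0.0] * len(w)
    for t, phi in enumerate(features):
        value = dot(w, phi)
        if t + 1 < len(features):
            # Intermediate step: zero reward, bootstrap from the next position.
            target = gamma * dot(w, features[t + 1])
        else:
            # Last step: the game outcome is the reward, terminal value is 0.
            target = outcome
        delta = target - value
        # For a linear V, the gradient with respect to w is just phi.
        trace = [gamma * lam * e + x for e, x in zip(trace, phi)]
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w


def train(weights: Vector, games: Sequence[Tuple[Sequence[Vector], float]],
          only_losses: bool = False, **kwargs: float) -> List[float]:
    """Run TD(lambda) over several games, optionally learning from losses only."""
    w = list(weights)
    for features, outcome in games:
        if only_losses and outcome >= 0:
            continue  # assumed interpretation: ignore wins and draws
        w = td_lambda_episode(w, features, outcome, **kwargs)
    return w
```

A typical call would be `train(w0, games, only_losses=True, alpha=0.01, lam=0.7)`, where `games` is a list of `(features, outcome)` pairs collected from finished training games.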

Similar resources

An Emergence Of Game Strategy In Multiagent Systems

In this paper, we study an emergence of game strategy in multiagent systems. Symbolic and subsymbolic approaches are compared. Symbolic approach is represented by a backtrack algorithm with specified search depth, whereas the subsymbolic approach is represented by feedforward neural networks that are adapted by reinforcement temporal difference TD(lambda) technique. As a test game, we used simp...

Evolutionary-based heuristic generators for checkers and give-away checkers

Two methods of genetic evolution of linear and non-linear heuristic evaluation functions for the game of checkers and give-away checkers are presented in the paper. The first method is based on the simplistic assumption that a relation ‘close’ to partial order can be defined over the set of the evaluation functions. Hence explicit fitness function is not necessary in this case and direct compar...

GOjen: tdGo Temporal Difference Learning of Go Playing Artificial Neural Networks

The original project description has been: An existing Java application handling and visualizing Go games between human and computer players (including trained and evolved ANNs) should be improved and extended with Go playing ANNs trained by temporal difference learning. This extension should serve as a basis for comparisons of td learning with conventional ANN training and evolutionary methods...

Comparison of TDLeaf(λ) and TD(λ) Learning in Game Playing Domain

In this paper we compare the results of applying TD(λ) and TDLeaf(λ) algorithms to the game of give-away checkers. Experiments show comparable performance of both algorithms in general, although TDLeaf(λ) seems to be less vulnerable to weight over-fitting. Additional experiments were also performed in order to test three learning strategies used in self-play. The best performance was achieved w...

Distributed Decision Making in Checkers

The game of checkers can be played by machines running either heuristic search algorithms or complex decision making programs trained using machine learning techniques. The first approach has been used with remarkable success. The latter approach yielded encouraging results in the past, but later results were not so useful, partly because of the limitations of current machine learning algorithms....

Publication date: 2004